We provide an in-depth analysis of aircraft crashes in the United States from 1980 to 2022, focusing on their locations, timings, and consequences, and exploring their causes and the influence of weather conditions on these incidents.
Question: Where and when do aircraft crashes occur, and what are their consequences?
Objective: To discover whether any correlated factors contributed to aircraft crashes during particular time periods, and to determine whether specific locations have more crashes than others.
Methodology: We used a detailed dataset from the National Transportation Safety Board (NTSB) together with data visualization techniques and time-series analyses. We created a heatmap of the number of crashes in the different regions and a radial bar plot to visualize crashes during certain flight phases.
Findings: Overall, total fatalities from crashes decreased between 1980 and 2022. There was a sharp spike in 2001, which we attribute to the 9/11 attacks; after that, the number of fatalities decreased significantly. The heatmap revealed that Alaska, Arizona, Texas, and Florida had the highest numbers of crashes. The radial bar plot showed the flight phases in which crashes were most prevalent.
Question: What contributes to the crashes, and does weather significantly affect the number of aircraft crashes?
Objective: To identify and categorize the most frequent causes of crashes and to correlate crash causes with the severity of outcomes.
Methodology: Using the dataset, we created a waffle chart to show the common causes of aircraft crashes, a density plot to show how crashes attributed to the leading cause are distributed over time by injury severity, and a radar plot to visualize crashes by month and weather conditions.
Findings: Pilot error emerged as the leading cause of crashes, followed by mechanical failures such as loss of engine power. The severity of crashes attributed to pilot's failure varied over time, as shown in the density plot. Weather-related crashes showed distinct patterns in the radar plot, with crash counts under each weather condition varying by month.
Analysis:
Question 1: Examining Aircraft Crashes, with a focus on their locations, timings, and consequences
Timeseries analysis of fatalities, and types of injuries.
Total Fatalities Over The Years
Animation of Total Fatalities
Timeseries Animations of Serious Injuries & Minor Injuries.
Animated Combined Timeseries of Aircraft Crashes
Findings
There has been a general decrease in the number of total fatalities from 1980 to 2022. A notable spike in fatalities was observed in 2001, attributed to the 9/11 attacks. Post-2001, a significant decline in fatalities was noted.
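The yearly totals behind this finding come from grouping crash records by year and summing the injury counts. A minimal base R sketch of that aggregation follows, assuming a toy data frame `crashes` whose values are made up for illustration (the column names mirror the cleaned NTSB columns, but nothing here is real data):

```r
# Toy records standing in for the NTSB data; every value is invented.
crashes <- data.frame(
  event_date = as.Date(c("2000-03-01", "2000-07-15", "2001-09-11", "2002-01-20")),
  fatal_injury_count = c(2, NA, 5, 1)
)

# Extract the calendar year from each event date.
crashes$event_year <- as.integer(format(crashes$event_date, "%Y"))

# Sum fatalities per year, ignoring missing counts.
total_fatalities <- tapply(crashes$fatal_injury_count, crashes$event_year,
                           sum, na.rm = TRUE)

yearly <- data.frame(
  event_year = as.integer(names(total_fatalities)),
  total_fatalities = as.vector(total_fatalities)
)

# Year with the most fatalities in the toy data.
peak_year <- yearly$event_year[which.max(yearly$total_fatalities)]
print(yearly)
```

Treating missing counts as zero via `na.rm = TRUE` is a choice worth stating, since a year with many unreported counts would otherwise look safer than it was.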
Heatmap of the number of crashes in different regions (US map)
Discussion: Looking at the heatmap, Alaska, Arizona, Texas, and Florida have the highest numbers of crashes.
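A per-year map needs a crash count for every state in every year, including zeros for states with no recorded crash. The sketch below shows one way to build such a grid in base R; the `records` data frame and its values are hypothetical.

```r
# Hypothetical crash records: one row per crash.
records <- data.frame(
  state = c("AK", "TX", "AK", "FL", "TX", "AK"),
  event_year = c(2000, 2000, 2001, 2001, 2001, 2001)
)

# Cross-tabulate state by year; combinations with no crashes get a 0 count.
counts <- as.data.frame(
  table(state = records$state, event_year = records$event_year),
  responseName = "total_crashes",
  stringsAsFactors = FALSE
)

# Total crashes per state across all years, highest first.
state_totals <- sort(tapply(counts$total_crashes, counts$state, sum),
                     decreasing = TRUE)
print(counts)
print(state_totals)
```

Without the explicit zeros, a state that had no crashes in a given year would simply be missing from that year's map rather than shown in the lowest color band.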
Question 2: Analysis of Causes of Crashes
Waffle chart
Purpose: The waffle chart shows the different causes of aircraft crashes. Pilot's failure was the most frequent cause of crashes, followed by loss of engine power.
Findings: The waffle chart shows the distribution of crash causes and emphasizes the prominence of human error in aviation incidents. The data highlight the need for enhanced safety measures and training to address the identified causes of crashes.
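A waffle chart with 100 cells needs whole-number shares that add up to exactly 100, which plain rounding does not guarantee. One common approach, sketched here with invented cause counts, is largest-remainder rounding: floor every percentage, then hand the leftover cells to the causes with the largest fractional parts.

```r
# Invented cause counts; the labels and numbers are illustrative only.
cause_counts <- c("pilot's failure" = 437,
                  "loss of engine power" = 318,
                  "other" = 245)

# Exact percentage share of each cause.
share <- cause_counts / sum(cause_counts) * 100

# Start from the floored shares and count how many cells are still unassigned.
cells <- floor(share)
leftover <- 100 - sum(cells)

# Give one extra cell to the causes with the largest fractional parts.
bump <- order(share - cells, decreasing = TRUE)[seq_len(leftover)]
cells[bump] <- cells[bump] + 1

print(cells)
```

The total is always 100 by construction, so the grid is filled completely and no cause is over-represented by more than one cell.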
Density Plot
Purpose: We used ggplot's geom_density function to visually analyze the distribution of flight crashes over the years based on their probable causes. Using the probable_cause_flights dataset and its cause_summary column, this visualization provides insight into the changing patterns and trends of aviation incidents. The x-axis represents the year, offering a chronological perspective, while the y-axis shows the density of crashes associated with a specific cause. Here, we focus on crashes whose cause summary is pilot's failure, with a separate curve for each highest injury level.
Discussion: This visual representation allows us to identify clusters of high density, indicating periods or years in which certain causes were more prevalent. It also helps detect outliers or shifts in patterns, enabling a more nuanced exploration of the dataset. Here, we can see that injuries have reduced over time, with the number of fatal injuries falling significantly over the past few decades.
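To make the idea of a density peak concrete, the sketch below estimates a kernel density of crash years separately for two injury levels and reports where each curve peaks. It uses base R's `density()` rather than `geom_density`, and the `crashes` data frame is a made-up example, not the project's data.

```r
# Made-up crash years for two injury levels.
crashes <- data.frame(
  event_year = c(1982, 1983, 1984, 1985, 1986,
                 2008, 2010, 2012, 2014, 2016),
  highest_injury_level = rep(c("Fatal", "Minor"), each = 5)
)

# For each injury level, estimate the density over 1980-2022 and
# return the year at which the estimated density is highest.
peak_year <- sapply(
  split(crashes$event_year, crashes$highest_injury_level),
  function(years) {
    d <- density(years, from = 1980, to = 2022)
    d$x[which.max(d$y)]
  }
)

print(round(peak_year))
```

Because each curve is normalized to integrate to one, a peak indicates when crashes of that level were concentrated, not how many there were; comparing absolute numbers still requires the yearly counts.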
Assessing the Influence of Weather Conditions on Crashes
Radar Plot
Purpose: We built the radar plot with Plotly for R to assess the influence of weather conditions on flight crashes. It uses the flights_ntsb_radar dataset, which we derived from the original flights_ntsb dataset by categorizing crashes as occurring under Visual Meteorological Conditions (VMC) or Instrument Meteorological Conditions (IMC). The visualization highlights how the number of crashes differs between these conditions. By layering both traces on one radar plot, we aim to provide a holistic perspective on how different weather scenarios contribute to flight incidents.
Discussion: The radar plot shows how crash counts under each weather condition vary through the year. Each axis on the radar represents a calendar month, and the distance from the center gives the number of crashes recorded in that month. Plotting the VMC and IMC traces together allows a direct month-by-month comparison, revealing patterns and discrepancies in their respective contributions to incidents.
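The data behind a monthly radar trace can be sketched in a few steps: merge equivalent weather codes, count crashes per month and condition, and repeat the first month at the end so the polygon closes. The example below is a base R sketch under those assumptions; the `records` data frame is invented, and the recoding of IFR to IMC and VFR to VMC follows the grouping described above.

```r
# Invented records: month of the crash and the reported weather code.
records <- data.frame(
  event_month = c(1, 1, 2, 7, 7, 7, 12),
  weather_condition = c("VFR", "IMC", "VMC", "VMC", "IFR", NA, "Unknown")
)

# Merge equivalent codes and label missing or unknown values as "UNK".
wc <- records$weather_condition
wc[wc %in% "IFR"] <- "IMC"
wc[wc %in% "VFR"] <- "VMC"
wc[is.na(wc) | wc == "Unknown"] <- "UNK"

# Count crashes for every month (1-12) and condition, keeping empty months as 0.
counts <- table(factor(records$event_month, levels = 1:12), wc)

# A radar trace must end where it starts, so January is appended again.
close_trace <- function(x) c(x, x[1])
imc_trace <- close_trace(as.vector(counts[, "IMC"]))
vmc_trace <- close_trace(as.vector(counts[, "VMC"]))
theta <- close_trace(month.abb)

print(rbind(theta, imc_trace, vmc_trace))
```

Fixing the month levels to 1 through 12 matters: if a condition had no crashes in some month, dropping that month would shift every later value onto the wrong axis.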
Radial Bar Plot
Purpose: We plotted the radial bar plot to explore the distribution of flight crashes and associated injuries across different phases of flight. By categorizing flight phases into Landing, Takeoff, Approach, Maneuvering, Climb, and Other, this visualization aims to uncover the critical moments during a flight when incidents are more likely to occur. The first graph shows the count of crashes in each phase, while the second shows the count of injuries, providing a broad perspective on the safety challenges associated with each phase.
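One way to assign a flight phase to each record is to search its probable-cause text for phase keywords and take the first match. The sketch below illustrates that idea in base R; the helper `classify_phase`, the keyword list, and the sample sentences are all hypothetical and much shorter than a real rule set would be.

```r
# Hypothetical helper: return the first phase whose keyword appears in the text.
classify_phase <- function(text) {
  # Order matters: earlier entries win when several keywords match.
  keywords <- c(Landing = "landing",
                Approach = "approach",
                Takeoff = "take-?off",
                Climb = "climb")
  for (phase in names(keywords)) {
    if (grepl(keywords[[phase]], text, ignore.case = TRUE)) {
      return(phase)
    }
  }
  NA_character_  # no keyword found
}

# Invented probable-cause sentences.
causes <- c("Loss of control during LANDING flare",
            "Engine failure shortly after take-off",
            "Fuel exhaustion")

phases <- vapply(causes, classify_phase, character(1), USE.NAMES = FALSE)
print(phases)
```

Because the first match wins, a more specific keyword has to be listed before any shorter keyword it contains, otherwise the specific rule can never fire.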
Discussion: The radial bar plot offers an intuitive and visually appealing representation of the distribution of crashes and injuries throughout various phases of flight. In the first graph, the bars radiating from the center depict the count of crashes in each phase, allowing for a quick comparison of their frequencies. This visualization enables the identification of phases that might be particularly prone to incidents, guiding further investigation into the contributing factors.
The second graph, depicting injuries, provides an additional layer of analysis. By comparing the counts of injuries across different flight phases, we can discern whether certain phases are more likely to result in severe consequences. This insight is crucial for understanding the potential risks associated with specific segments of a flight, informing safety measures and protocols.
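Keeping only the most frequent phases and folding the rest into a single "Other" bar keeps a radial plot readable. A minimal base R sketch of that lumping step, using invented counts:

```r
# Invented crash counts per flight phase.
phase_counts <- c(Landing = 50, Takeoff = 30, Approach = 20, Maneuvering = 15,
                  Climb = 10, Taxi = 4, Cruise = 3, Hover = 1)

# Rank phases by count, keep the top five, and sum the remainder as "Other".
top_n <- 5
ranked <- sort(phase_counts, decreasing = TRUE)
lumped <- c(ranked[seq_len(top_n)],
            Other = sum(ranked[-seq_len(top_n)]))

print(lumped)
```

The grand total is unchanged by the lumping, so shares computed from `lumped` remain comparable with shares computed from the full table.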
Conclusion
We conclude that, although many crashes still take place every year, the number of crashes has been decreasing over the past few decades. The number of fatalities has also gone down, which we attribute to the stringent rules in the aviation industry. With fatalities and crashes decreasing over time, air travel is becoming a safer and faster mode of transport that can take us around the globe in a matter of hours.
---
title: "From Takeoff to Touchdown: Dissecting Data on Air Disasters"
subtitle: "INFO 526 - Project Final"
author:
  - name: "Infographic Innovators - Antonio, Bharath, Eshaan, Thanoosha"
    affiliations:
      - name: "School of Information, University of Arizona"
description: "A shiny app integration with aircraft crash analysis"
format:
  html:
    code-tools: true
    code-overflow: wrap
    embed-resources: true
editor: visual
code-fold: true
execute:
  warning: false
  echo: false
---

## Analysis:

```{r load_packages, message=FALSE, include=FALSE}
# GETTING THE LIBRARIES
if (!require(pacman))
  install.packages("pacman")

pacman::p_load(tidyverse, dplyr, janitor, dlookr, here, ggpubr, maps, plotly,
               gganimate, MetBrewer, ggsci, scales, fmsb, gifski, ggimage,
               ggtext, emojifont, magick, lubridate, patchwork, viridis,
               usmap, sf, sp, animation)

pacman::p_load_gh("BlakeRMills/MoMAColors")
```

```{r ggplot_setup, message=FALSE, include=FALSE}
# setting theme for ggplot2
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 14, base_family = "sans"))

# setting width of code output
options(width = 65)

# setting figure parameters for knitr
knitr::opts_chunk$set(
  fig.width = 8,        # 8" width
  fig.asp = 0.618,      # the golden ratio
  fig.retina = 1,       # dpi multiplier for displaying HTML output on retina
  fig.align = "center", # center align figures
  dpi = 200,            # higher dpi, sharper image
  message = FALSE
)
```

```{r load_dataset, include=FALSE}
# Reading the data using read_csv
flights_ntsb <- read_csv(here("data", "flight_crash_data_NTSB.csv"))
probable_cause_flights <- read_csv(here("data", "new_flights_PC.csv"))
```

```{r remove_columns, include=FALSE}
# selecting columns which are required for our analysis
flights_ntsb <- flights_ntsb |>
  select(
    EventType, EventDate, City, State, ReportType, HighestInjuryLevel,
    FatalInjuryCount, SeriousInjuryCount, MinorInjuryCount, ProbableCause,
    Latitude, Longitude, AirCraftCategory, NumberOfEngines, AirCraftDamage,
    WeatherCondition
  ) |>
  # cleaning column names using janitor package
  clean_names()

probable_cause_flights["CauseSummary"] <- probable_cause_flights["Probable_Cause"]

# removing null values
probable_cause_flights <- subset(probable_cause_flights,
                                 probable_cause_flights$HighestInjuryLevel != "")

probable_cause_flights <- probable_cause_flights |>
  select(
    EventType, EventDate, City, State, ReportType, HighestInjuryLevel,
    FatalInjuryCount, SeriousInjuryCount, MinorInjuryCount, ProbableCause,
    Latitude, Longitude, AirCraftCategory, NumberOfEngines, AirCraftDamage,
    WeatherCondition, CauseSummary
  ) |>
  # cleaning column names using janitor package
  clean_names()
```

```{r data_wrangle, include=FALSE}
# Modifying the data using mutate()
flights_ntsb <- flights_ntsb |>
  # getting event time from the date time column
  mutate(
    event_time = format(event_date, "%H:%M"),
    # places the column after event_date
    .after = event_date
  ) |>
  mutate(
    # getting event date from the date time column
    event_date = as.Date(event_date),
    # extracting flight phase based on NTSB's designated terminology
    flight_phase = case_when(
      grepl("Landing", probable_cause, ignore.case = TRUE) ~ "Landing",
      grepl("Stop", probable_cause, ignore.case = TRUE) ~ "Landing",
      grepl("Approach", probable_cause, ignore.case = TRUE) ~ "Approach",
      grepl("Takeoff", probable_cause, ignore.case = TRUE) ~ "Takeoff",
      grepl("Take-off", probable_cause, ignore.case = TRUE) ~ "Takeoff",
      grepl("Maneuvering", probable_cause, ignore.case = TRUE) ~ "Maneuvering",
      grepl("Climb", probable_cause, ignore.case = TRUE) ~ "Climb",
      grepl("Descent", probable_cause, ignore.case = TRUE) ~ "Descent",
      grepl("Taxi", probable_cause, ignore.case = TRUE) ~ "Taxi",
      grepl("Cruise", probable_cause, ignore.case = TRUE) ~ "Cruise",
      grepl("Hover", probable_cause, ignore.case = TRUE) ~ "Hover",
      grepl("Standing", probable_cause, ignore.case = TRUE) ~ "Standing",
      grepl("Uncontrolled Descent", probable_cause, ignore.case = TRUE) ~ "Uncontrolled Descent",
      grepl("Emergency", probable_cause, ignore.case = TRUE) ~ "Emergency",
      grepl("Holding", probable_cause, ignore.case = TRUE) ~ "Holding",
    ),
    # places the flight phase column after probable cause
    .after = probable_cause
  ) |>
  # getting year and month of the events
  mutate(
    event_year = year(event_date),
    event_month = month(event_date)
  )
```

### Question 1: Examining Aircraft Crashes, with a focus on their locations, timings, and consequences

#### Timeseries analysis of fatalities, and types of injuries.

Total Fatalities Over The Years

```{r timeseries_data, include=FALSE}
# Creating a dataset for the timeSeries plot
flights_ntsb_timeseries <- flights_ntsb |>
  group_by(event_year) |>
  summarise(
    total_fatalities = sum(fatal_injury_count, na.rm = TRUE),
    total_serious_injuries = sum(serious_injury_count, na.rm = TRUE),
    total_minor_injuries = sum(minor_injury_count, na.rm = TRUE)
  )

# time series plot ----
tp <- ggplot(flights_ntsb_timeseries,
             aes(x = event_year, y = total_fatalities)) +
  geom_line() +
  labs(title = "Yearly Aircraft Crash Fatalities",
       x = "Year",
       y = "Total Fatalities") +
  theme_minimal()

# Interactive plot with plotly ---- total fatalities
p <- ggplot(flights_ntsb_timeseries,
            aes(x = event_year, y = total_fatalities)) +
  geom_line() +
  geom_text(aes(label = "✈️"),
            vjust = -0.5,
            hjust = 0.5,
            size = 5) +
  labs(title = "Yearly Aircraft Crash Fatalities",
       x = "Year",
       y = "Total Fatalities")

ggplotly(p)
```

Animation of Total Fatalities

```{r time_series_gif}
# Creating Animated Plot for Total Fatalities
Image <- "images/airplane.png" # Save Image instead of inserting emoji

# Labelling the axes and adding title to the plot for Fatalities
p <- ggplot(flights_ntsb_timeseries,
            aes(x = event_year,
                y = total_fatalities)) +
  geom_line() +
  geom_image(aes(image = Image),
             size = 0.05) +
  labs(title = "Yearly Aircraft Crash Fatalities",
       x = "Year",
       y = "Total Fatalities") +
  theme_minimal()

# Building the animation
animated_plot <- p +
  transition_reveal(event_year) +
  ease_aes('linear') +
  shadow_mark()

# Rendering the animation
animate(
  animated_plot,
  nframes = 200,
  width = 800,
  height = 600,
  renderer = gifski_renderer()
)

animated_plot
```

Timeseries Animations of Serious Injuries & Minor Injuries.

```{r timeseries_animated, include=FALSE}
# Labelling the axes and adding title to the plot for Serious Injuries
p_serious_injuries <- ggplot(flights_ntsb_timeseries,
                             aes(x = event_year,
                                 y = total_serious_injuries)) +
  geom_line() +
  geom_image(aes(image = Image),
             size = 0.05) +
  labs(title = "Yearly Aircraft Crash Serious Injuries",
       x = "Year",
       y = "Total Serious Injuries") +
  theme_minimal()

# Animating the Aircraft serious injuries plot
animated_serious_injuries <- p_serious_injuries +
  transition_reveal(event_year) +
  ease_aes('linear') +
  shadow_mark()

animate(
  animated_serious_injuries,
  nframes = 200,
  width = 800,
  height = 600,
  renderer = gifski_renderer()
)

# Plotting a timeseries graph for total minor injuries
p_minor_injuries <- ggplot(flights_ntsb_timeseries,
                           aes(x = event_year,
                               y = total_minor_injuries)) +
  geom_line() +
  geom_image(aes(image = Image), size = 0.05) +
  labs(title = "Yearly Aircraft Crash Minor Injuries",
       x = "Year",
       y = "Total Minor Injuries") +
  theme_minimal()

# Animating the timeseries plot for total minor injuries
animated_minor_injuries <- p_minor_injuries +
  transition_reveal(event_year) +
  ease_aes('linear') +
  shadow_mark()

animate(
  animated_minor_injuries,
  nframes = 200,
  width = 800,
  height = 600,
  renderer = gifski_renderer()
)

# Reshaping the data to a longer format
long_fcdata <- flights_ntsb_timeseries |>
  pivot_longer(cols = starts_with("total_"),
               names_to = "Category",
               values_to = "Count")

# Static plot for 3 animations - interactive
# Time series plot with all categories
tp <- ggplot(long_fcdata,
             aes(x = event_year,
                 y = Count,
                 color = Category)) +
  geom_line() +
  labs(title = "Yearly Aircraft Crash Statistics",
       x = "Year",
       y = "Count") +
  theme_minimal()

# Interactive plot with plotly
p <- ggplot(long_fcdata,
            aes(x = event_year,
                y = Count,
                color = Category)) +
  geom_line() +
  geom_text(aes(label = "✈️"),
            vjust = -0.5,
            hjust = 0.5,
            size = 5) +
  labs(title = "Yearly Aircraft Crash Statistics",
       x = "Year",
       y = "Count")

ggplotly(p)
```

Animated Combined Timeseries of Aircraft Crashes

```{r animated_labels}
# all 3 plots with label - animated
fcdata <- flights_ntsb_timeseries |>
  ungroup() |>
  pivot_longer(cols = starts_with("total_"),
               names_to = "Category",
               values_to = "Count") |>
  mutate(
    Category = case_when(
      Category == "total_fatalities" ~ "Fatalities",
      Category == "total_serious_injuries" ~ "Serious Injuries",
      Category == "total_minor_injuries" ~ "Minor Injuries"
    )
  )

# Creating the animated plot
animation_fcdata <- ggplot(fcdata,
                           aes(event_year, Count,
                               group = Category,
                               color = Category)) +
  geom_line(size = 1.25, show.legend = FALSE) +
  geom_segment(
    aes(xend = max(event_year) + 0.1,
        yend = Count),
    linetype = 2,
    colour = 'grey',
    show.legend = FALSE
  ) +
  geom_point(size = 2,
             show.legend = FALSE) +
  geom_text(
    aes(
      x = max(event_year) + 0.1,
      label = Category,
      color = "#000000"
    ),
    hjust = 0,
    show.legend = FALSE
  ) +
  geom_vline(xintercept = 2001, color = "red", size = 1.5) +
  transition_reveal(event_year) +
  coord_cartesian(clip = 'off') +
  theme(plot.title = element_text(size = 20)) +
  labs(title = 'Aircraft Crash Statistics Over Time',
       y = 'Number of Crashes',
       x = element_blank())

# Animating the plot
animated_fcdata <- animate(
  animation_fcdata,
  fps = 10,
  duration = 25,
  width = 1200,
  height = 700,
  renderer = gifski_renderer("images/animated_fcdata.gif")
)

# Display or save the animated plot
animated_fcdata
```

#### Heatmap of the number of crashes in different regions (US map)

```{r mapl_plot_data, include=FALSE}
# Filtering out NA and invalid Latitude and Longitude values
flights_ntsb_maps <- flights_ntsb |>
  subset(!is.na(longitude) & !is.na(latitude) &
           latitude < 75 & latitude > 10 & longitude < -60)

# Creating a dataset for the map plot and picking the columns that we need
flights_ntsb_maps <- flights_ntsb_maps |>
  # getting total crashes and total injuries
  group_by(state, event_year) |>
  summarise(
    total_crashes = n(),
    total_injuries = sum(fatal_injury_count, na.rm = TRUE) +
      sum(serious_injury_count, na.rm = TRUE) +
      sum(minor_injury_count, na.rm = TRUE)
  ) |>
  drop_na()

# getting unique states in the data
unique_states <- unique(flights_ntsb_maps$state)

# getting all years involved in the data
all_years <- unique(flights_ntsb_maps$event_year)

# combining above data so that we will be generating data of the states
# which are missing in the original data
all_states_data <- expand.grid(state = unique_states, event_year = all_years)

flights_ntsb_maps <-
  # using left_join() to combine the data based on state and event_year
  left_join(all_states_data, flights_ntsb_maps,
            by = c("state", "event_year")) |>
  mutate(
    total_crashes = ifelse(is.na(total_crashes), 0, total_crashes),
    total_injuries = ifelse(is.na(total_injuries), 0, total_injuries)
  )
```

```{r function_hexbin, include=FALSE}
# Creating a function to plot a Hex-Bin map
getHexBinMap <- function(input_year) {
  plot_data <- flights_ntsb_maps |>
    filter(event_year == input_year)

  plot_usmap(data = plot_data,
             values = "total_crashes",
             color = "black") +
    theme_void() +
    scale_fill_gradientn(
      colors = met.brewer("Hokusai2"),
      name = "Number of Crashes",
      limits = c(0, 400)
    ) +
    labs(title = sprintf("Flight crashes in US states during %d", input_year)) +
    theme(
      legend.position = 'right',
      plot.title = element_text(size = 40, hjust = 0.5, face = "bold", vjust = -1),
      legend.text = element_text(size = 18),
      legend.title = element_text(size = 20, face = "bold")
    )
}
```

```{r images_yearwise, include=FALSE}
# Creating a list of Years from dataset
years_list <- flights_ntsb_maps |>
  arrange(event_year) |>
  distinct(event_year) |>
  pull()

for (i in seq_along(years_list)) {
  my_maps <- paste0("images/map_plot/flight_crash_us_states", years_list[i], ".jpg")

  getHexBinMap(input_year = years_list[i])

  ggsave(
    my_maps,
    height = 9,
    width = 15,
    unit = "in",
    dpi = 200
  )
}
```

```{r animation_hexbin}
# making gif from the saved images using the magick package
hex_bin_maps <- list.files(path = "images/map_plot/", full.names = TRUE)
hex_bin_maps_list <- lapply(hex_bin_maps, image_read)

# Joining all the saved images
joined_plots <- image_join(hex_bin_maps_list)

# Animating the images using image_animate() and resetting the resolution
# Setting fps = 2
hex_bin_maps_animation <- image_animate(image_scale(joined_plots, "2000x1000"), fps = 2)

# Saving gif to the repository
image_write(image = hex_bin_maps_animation,
            path = "images/flight_crash_us_states.gif")

hex_bin_maps_animation
```

### Question 2: Analysis of Causes of Crashes

#### Waffle chart

```{r datawragling,include=FALSE}
count <- probable_cause_flights |>
  group_by(cause_summary) |>
  summarize(perc = n()) |>
  mutate(perc = (perc / sum(perc)))

count_waffle <- count |>
  mutate(
    remainder = perc * 100 - floor(perc * 100),
    floored = floor(perc * 100)
  ) |>
  arrange(desc(remainder)) |>
  mutate(number = ifelse(100 - sum(floored) >= row_number(), floored + 1, floored)) |>
  arrange(perc)
```

```{r function-waffle}
waffle_plot <- function(number, colour, colour_palette, symbol, symbol_size = 8) {
  p <- expand.grid(x = 0:9,
                   y = 0:9) %>%
    rowwise() |>
    mutate(index = 1 + sum(x * 10 + y >= cumsum(number)),
           col = colour[[index]]) |>
    ggplot(aes(x, y, color = forcats::fct_inorder(col))) +
    geom_text(label = symbol,
              family = 'sans',
              size = symbol_size) +
    scale_color_manual(values = moma.colors("Warhol")) +
    coord_equal() +
    theme_void() +
    theme(
      legend.position = 'top',
      legend.margin = margin(1, 3, 1, 1, unit = 'mm'),
      plot.margin = margin(3, 3, 3, 3, unit = 'mm'),
      legend.background = element_rect(fill = 'grey100', color = 'grey')
    ) +
    labs(title = "Waffle chart showing different causes of crashes",
         colour = 'Cause')
  return(p)
}
```

```{r wafflechart}
airplane_emoji <- "\U2708"

waffle_plot(number = count_waffle$number,
            colour = count_waffle$cause_summary,
            symbol = "✈",
            symbol_size = 6)
```

#### Density Plot

```{r distribution}
ggplot(subset(probable_cause_flights,
              probable_cause_flights$cause_summary == "pilot's failure")) +
  geom_density(aes(x = year(event_date), fill = highest_injury_level), alpha = 0.8) +
  scale_fill_manual(values = c("#bcd67c", "#82dfe2", "#d398ff")) +
  labs(title = "Distribution of crashes caused by pilot's negligence over time",
       x = "Year",
       y = "Density")
```

### Assessing the Influence of Weather Conditions on Crashes

#### Radar Plot

```{r radarchart_data, include=FALSE}
# Assigning different weather conditions to variables
# IFR and IMC are same conditions so we are combining them
# VFR and VMC are same conditions so we are combining them
flights_ntsb_radar <- flights_ntsb |>
  mutate(
    weather_condition = case_when(
      weather_condition == "IFR" ~ "IMC",
      weather_condition == "VFR" ~ "VMC",
      weather_condition == "Unknown" ~ "UNK",
      is.na(weather_condition) ~ "UNK",
      TRUE ~ weather_condition
    )
  ) |>
  # getting total crashes and total injuries
  group_by(event_month, weather_condition) |>
  summarise(
    total_crashes = n(),
    total_injuries = sum(fatal_injury_count, na.rm = TRUE) +
      sum(serious_injury_count, na.rm = TRUE) +
      sum(minor_injury_count, na.rm = TRUE)
  ) |>
  arrange(event_month)
```

```{r radar_chart, include=FALSE}
# Keeping only weather condition 'IMC'
radar_trace_r1 <- flights_ntsb_radar |>
  filter(weather_condition %in% c("IMC")) |>
  pull(total_crashes)

# Repeating January at the end so the trace closes
radar_trace_r1 <- c(
  radar_trace_r1,
  flights_ntsb_radar |>
    filter(weather_condition == "IMC" & event_month == 1) |>
    pull(total_crashes)
)

# Keeping only weather condition 'VMC'
radar_trace_r2 <- flights_ntsb_radar |>
  filter(weather_condition %in% c("VMC")) |>
  pull(total_crashes)

# Repeating January at the end so the trace closes
radar_trace_r2 <- c(
  radar_trace_r2,
  flights_ntsb_radar |>
    filter(weather_condition == "VMC" & event_month == 1) |>
    pull(total_crashes)
)

radar_theta <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
                 "Aug", "Sept", "Oct", "Nov", "Dec", "Jan")
```

```{r radar_chart_plotly}
# Plotting an Interactive Radar Plot using Plotly
radar_chart <- plot_ly(type = 'scatterpolar',
                       fill = 'toself',
                       mode = 'lines+markers')

# first plot tracing of VMC data
radar_chart <- radar_chart |>
  add_trace(
    r = radar_trace_r2,
    theta = radar_theta,
    name = 'VMC'
  )

# second plot tracing of IMC data
radar_chart <- radar_chart |>
  add_trace(
    r = radar_trace_r1,
    theta = radar_theta,
    name = 'IMC'
  )

# plot layout configuration
radar_chart <- radar_chart |>
  layout(polar = list(radialaxis = list(
    visible = T,
    range = c(0, 10500)
  )))

radar_chart
```

#### Radial Bar Plot

```{r radial_plot, include=FALSE}
# Creating a filtered dataset for Radial Plot
flights_ntsb_radial <- flights_ntsb |>
  # getting total crashes and total injuries
  group_by(flight_phase) |>
  summarise(
    total_crashes = n(),
    total_injuries = sum(fatal_injury_count, na.rm = TRUE) +
      sum(serious_injury_count, na.rm = TRUE) +
      sum(minor_injury_count, na.rm = TRUE)
  ) |>
  drop_na() |>
  arrange(desc(total_crashes))

# getting top 5 flight phases where flight crashes occurred
flights_ntsb_radial <- flights_ntsb_radial |>
  slice(1:5) |>
  bind_rows(
    flights_ntsb_radial |>
      slice(-(1:5)) |>
      summarize(
        flight_phase = "Other",
        total_crashes = sum(total_crashes),
        total_injuries = sum(total_injuries)
      )
  ) |>
  mutate(flight_phase = fct_inorder(flight_phase))

flights_ntsb_radial
```

```{r radial_plot_crashes}
#| code-fold: true
#| code-summary: "Radial Bar Plot - Total Crashes"

# Plotting a Radial Barplot to show the Total Count of Crashes and phase of flight
flights_radial_bar_crashes <- ggplot(flights_ntsb_radial,
                                     aes(x = fct_rev(flight_phase),
                                         y = total_crashes,
                                         fill = flight_phase)) +
  geom_bar(stat = "identity", width = 0.8) +
  geom_text(hjust = 1.2, size = 4.2, aes(y = 0, label = comma(total_crashes))) +
  coord_polar(theta = "y") +
  labs(
    x = NULL,
    y = NULL,
    fill = "Phase of Flight",
    title = "A Radial View of Total Crashes",
    subtitle = "as per the phase of flight",
    caption = ""
  ) +
  scale_y_continuous(
    breaks = seq(0, 27000, by = 5000),
    limits = c(0, 27000)
  ) +
  scale_x_discrete(expand = c(0.35, 0)) +
  scale_fill_frontiers() +
  theme(
    legend.position = "bottom",
    axis.text = element_blank(),
    panel.grid.minor = element_blank(),
    panel.grid.major = element_blank()
  ) +
  guides(
    fill = guide_legend(
      nrow = 1,
      direction = "horizontal",
      title.position = "top",
      title.hjust = 0.5,
      label.position = "bottom",
      label.hjust = 1,
      label.vjust = 1,
      label.theme = element_text(lineheight = 0.25, size = 14),
      keywidth = 1.5,
      keyheight = 0.5
    )
  )

flights_radial_bar_crashes

# remaining - titles, some texts
# interactive - plotly
# animate - plotly by years or gganimate bar growth
```

```{r radial_plot_injuries}
#| code-fold: true
#| code-summary: "Radial Bar Plot - Total Injuries"

# Plotting a Radial Barplot to show the Total Count of Injuries and phase of flight
flights_radial_bar_injuries <- ggplot(flights_ntsb_radial,
                                      aes(x = fct_rev(flight_phase),
                                          y = total_injuries,
                                          fill = flight_phase)) +
  geom_bar(stat = "identity", width = 0.8) +
  geom_text(hjust = 1.2, size = 4.2, aes(y = 0, label = comma(total_injuries))) +
  coord_polar(theta = "y") +
  labs(
    x = NULL,
    y = NULL,
    fill = "Phase of Flight",
    title = "A Radial View of Total Injuries",
    subtitle = "as per the phase of flight",
    caption = ""
  ) +
  scale_y_continuous(
    breaks = seq(0, 13200, by = 1000),
    limits = c(0, 13200)
  ) +
  scale_x_discrete(expand = c(0.35, 0)) +
  scale_fill_manual(values = moma.colors("VanGogh")) +
  theme(
    legend.position = "bottom",
    axis.text = element_blank(),
    panel.grid.minor = element_blank(),
    panel.grid.major = element_blank()
  ) +
  guides(
    fill = guide_legend(
      nrow = 1,
      direction = "horizontal",
      title.position = "top",
      title.hjust = 0.5,
      label.position = "bottom",
      label.hjust = 1,
      label.vjust = 1,
      label.theme = element_text(lineheight = 0.25, size = 14),
      keywidth = 1.5,
      keyheight = 0.5
    )
  )

flights_radial_bar_injuries

# remaining - titles, some texts
# interactive - plotly
# animate - plotly by years or gganimate bar growth
```